The Essential Dynamics Algorithm: Essential Results
Abstract
This paper presents a novel algorithm for learning in a class of stochastic Markov decision processes (MDPs) with continuous state and action spaces that trades speed for accuracy. We present a transform of the stochastic MDP into a deterministic one that captures the essence of the original dynamics, in a sense made precise. In this transformed MDP, the calculation of values is greatly simplified. The online algorithm estimates the model of the transformed MDP and simultaneously performs policy search against it. Bounds on the error of this approximation are proven, and experimental results in a bicycle riding domain are presented. The algorithm learns near-optimal policies in orders of magnitude fewer interactions with the stochastic MDP, using less domain knowledge. All code used in the experiments is available on the project’s website. This work was funded by DARPA as part of the "Natural Tasking of Robots Based on Human Interaction Cues" project under contract number DABT 63-00-C-10102.
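The abstract does not define the transform, so the following is only a toy sketch of the general idea under stated assumptions. It assumes the deterministic surrogate replaces each random transition by its expected next state, uses a hypothetical one-dimensional linear system, fits the surrogate by least squares, and searches over a linear feedback gain. None of these names, dynamics, or costs come from the paper. It also runs in batch, whereas the paper's algorithm is online.

```python
import random

def step(s, u, rng, a=0.9, b=0.5, noise=0.1):
    """Hypothetical stochastic dynamics: s' = a*s + b*u + Gaussian noise."""
    return a * s + b * u + rng.gauss(0.0, noise)

def fit_mean_dynamics(samples):
    """Least-squares fit of s' ~ a*s + b*u from (s, u, s') triples.

    The fitted pair (a_hat, b_hat) is the assumed deterministic surrogate:
    it predicts the expected next state and ignores the noise.
    """
    sss = sum(s * s for s, u, n in samples)
    suu = sum(u * u for s, u, n in samples)
    ssu = sum(s * u for s, u, n in samples)
    ssn = sum(s * n for s, u, n in samples)
    sun = sum(u * n for s, u, n in samples)
    det = sss * suu - ssu * ssu
    a_hat = (ssn * suu - sun * ssu) / det
    b_hat = (sun * sss - ssn * ssu) / det
    return a_hat, b_hat

def rollout_cost(k, a, b, s0=1.0, horizon=20):
    """Cost of the policy u = -k*s in the deterministic model.

    Because the model is deterministic, one rollout gives the exact value;
    no averaging over sampled trajectories is needed.
    """
    s, cost = s0, 0.0
    for _ in range(horizon):
        u = -k * s
        cost += s * s + 0.1 * u * u  # illustrative quadratic cost
        s = a * s + b * u
    return cost

def policy_search(a, b, gains):
    """Pick the candidate gain with the lowest model-predicted cost."""
    return min(gains, key=lambda k: rollout_cost(k, a, b))

# Collect interaction data from the stochastic system with random actions.
rng = random.Random(0)
samples, s = [], 1.0
for _ in range(200):
    u = rng.uniform(-1.0, 1.0)
    s_next = step(s, u, rng)
    samples.append((s, u, s_next))
    s = s_next

a_hat, b_hat = fit_mean_dynamics(samples)
gains = [i / 10 for i in range(21)]  # candidate gains 0.0 .. 2.0
best_k = policy_search(a_hat, b_hat, gains)
print(a_hat, b_hat, best_k)
```

The point of the sketch is the division of labor: interaction with the stochastic system is used only to estimate the model, while every policy evaluation is a single cheap rollout in the deterministic surrogate.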
Similar resources
Comparative antimicrobial efficacy, kinetic destruction pattern and microbial inactivation dynamics of extracted cinnamon essential oil and commercial cinnamaldehyde against food borne pathogens
Background and Objective: The increasing demand for the discovery of next-generation antimicrobials necessitates the use of plant extracts as alternatives. This study investigates the antibacterial efficacy of extracted cinnamon essential oil (CEO) and commercial cinnamaldehyde (CN) against foodborne pathogens. Methods: The Kirby-Bauer disc diffusion method was used to screen the antimicrobial pot...
Essential Oil of Aaronsohnia Pubescens Subsp. Pubescens as Novel Eco-Friendly Inhibitor for Mild Steel in 1.0 M HCl
The essential oil from the aerial parts of the Aaronsohnia pubescens subsp. pubescens plant (APS oil) was extracted by hydrodistillation, and its composition was then analyzed by gas chromatography (GC) and GC-mass spectrometry (GC/MS). We identified thirty-four constituents representing 87% of the total amount, of which carvacrol (13.9%), α-pinene (10.3%), E-anethole (10.1%) and ar-turmerone (9.3%) w...
Multicomponent Distillation Modeling of An Essential Oil by the SRK and PSRK State Equations
The Soave-Redlich-Kwong (SRK) equation of state and its modification (predictive SRK, or PSRK) are applied to simulate multicomponent distillation, which separates the main component of spearmint essential oil. The simulation model is based on the bubble point method and the Wang-Henke algorithm. Spearmint essential oil is considered in the study and the original experimental data were obtained f...
Molecular Dynamics Investigation of The Elastic Constants and Moduli of Single Walled Carbon Nanotubes
Determination of the mechanical properties of carbon nanotubes is an essential step in their applications from macroscopic composites to nano-electro-mechanical systems. In this paper we report the results of a series of molecular dynamics simulations carried out to predict the elastic constants, i.e. the elements of the stiffness tensor, and the elastic moduli, namely the Young’s and shear mod...
Dynamics of Space Free-Flying Robots with Flexible Appendages
A Space Free-Flying Robot (SFFR) includes an actuated base equipped with one or more manipulators to perform on-orbit missions. Distinct from fixed-based manipulators, the spacecraft (base) of a SFFR responds to dynamic reaction forces due to manipulator motions. In order to control such a system, it is essential to consider the dynamic coupling between the manipulators and the base. Explicit d...
Aerodynamic Design Optimization Using Genetic Algorithm (RESEARCH NOTE)
An efficient formulation for the robust shape optimization of aerodynamic objects is introduced in this paper. The formulation has three essential features. First, an Euler solver based on a second-order Godunov scheme is used for the flow calculations. Second, a genetic algorithm with binary number encoding is implemented for the optimization procedure. The third ingredient of the procedure is...
Publication date: 2003